Identifying Keystone Species in the Human Gut Microbiome from Metagenomic Timeseries using Sparse Linear Regression

Authors: Fisher Charles K.; Mehta Pankaj
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

Human-associated microbial communities exert tremendous influence over human health and disease. With modern metagenomic sequencing methods it is possible to follow the relative abundance of microbes in a community over time. These microbial communities exhibit rich ecological dynamics, and an important goal of microbial ecology is to infer the interactions between species from sequence data. Any algorithm for inferring species interactions must overcome three obstacles: 1) a correlation between the abundances of two species does not imply that those species are interacting, 2) the sum constraint on the relative abundances obtained from metagenomic studies makes it difficult to infer the parameters in timeseries models, and 3) errors due to experimental uncertainty, or mis-assignment of sequencing reads into operational taxonomic units, bias inferences of species interactions. Here we introduce an approach, Learning Interactions from MIcrobial Time Series (LIMITS), that overcomes these obstacles. LIMITS uses sparse linear regression with bootstrap aggregation to infer a discrete-time Lotka-Volterra model for microbial dynamics. We tested LIMITS on synthetic data and showed that it could reliably infer the topology of the inter-species ecological interactions. We then used LIMITS to characterize the species interactions in the gut microbiomes of two individuals and found that the interaction networks varied significantly between individuals. Furthermore, we found that the interaction networks of the two individuals are dominated by distinct "keystone species", Bacteroides fragilis and Bacteroides stercoris, that have a disproportionate influence on the structure of the gut microbiome even though they are only found in moderate abundance. Based on our results, we hypothesize that the abundances of certain keystone species may be responsible for individuality in the human gut microbiome.
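The regression idea in this abstract can be illustrated with a small sketch. It is not the LIMITS algorithm: it assumes a Ricker-type discrete-time Lotka-Volterra form, x_i(t+1) = x_i(t) exp(r_i + sum_j A_ij x_j(t) + noise), works with absolute rather than relative abundances, and replaces the sparse regression step with ordinary least squares. The function names `simulate` and `bagged_fit`, the matrix `A_true`, and all parameter values are hypothetical. The point it shows is that the log ratio of successive abundances is linear in the current abundances, so interaction coefficients can be fit by regression and aggregated over bootstrap resamples (here with a median).

```python
import numpy as np

def simulate(A, r, x0, steps, noise_sd, rng):
    """Simulate an assumed Ricker-type discrete-time Lotka-Volterra model."""
    x = np.empty((steps + 1, len(x0)))
    x[0] = x0
    for t in range(steps):
        noise = rng.normal(0.0, noise_sd, len(x0))
        x[t + 1] = x[t] * np.exp(r + A @ x[t] + noise)
    return x

def bagged_fit(x, n_boot, rng):
    """Fit log(x(t+1)/x(t)) = r + A x(t) by least squares on bootstrap
    resamples of the time points, then take the elementwise median."""
    y = np.log(x[1:] / x[:-1])                       # responses, shape (T, n)
    X = np.column_stack([np.ones(len(y)), x[:-1]])   # intercept + abundances
    fits = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))        # resample with replacement
        coef, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        fits.append(coef)                            # shape (n + 1, n)
    med = np.median(fits, axis=0)
    # Row 0 holds intercepts; remaining rows hold A transposed.
    return med[0], med[1:].T

# Hypothetical three-species example with equilibrium at x = (1, 1, 1).
A_true = np.array([[-1.0, 0.0, 0.5],
                   [0.0, -1.0, 0.0],
                   [-0.5, 0.0, -1.0]])
r_true = -A_true @ np.ones(3)
rng = np.random.default_rng(0)
x = simulate(A_true, r_true, np.ones(3), steps=2000, noise_sd=0.1, rng=rng)
r_hat, A_hat = bagged_fit(x, n_boot=30, rng=rng)
print(np.round(A_hat, 2))
```

A sparse method would additionally set weak coefficients to exactly zero; this sketch only recovers dense estimates.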

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

FigShare

A Rule of Thumb for the Power Gain due to Covariate Adjustment in Randomized Controlled Trials with Continuous Outcomes

Author: Fisher Charles K.
Publication date: 09/08/2023

Randomized Controlled Trials (RCTs) often adjust for baseline covariates in order to increase power. This technical note provides a short derivation of a simple rule of thumb for approximating the ratio of the power of an adjusted analysis to that of an unadjusted analysis. Specifically, if the unadjusted analysis is powered to approximately 80%, then the ratio of the power of the adjusted analysis to the power of the unadjusted analysis is approximately $1 + \frac{1}{2} R^2$, where $R$ is the correlation between the baseline covariate and the outcome.
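The rule of thumb can be checked numerically with a short sketch. This is not the note's derivation: it assumes a normal approximation to a two-sided test at level 0.05 (ignoring the far rejection tail) and assumes that adjusting for a single covariate with correlation R shrinks the residual variance by a factor of 1 - R^2. The function name `power_ratio` and its defaults are hypothetical.

```python
from statistics import NormalDist

def power_ratio(r, alpha=0.05, base_power=0.80):
    """Ratio of adjusted to unadjusted power under a normal approximation."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    # Standardized effect that gives the unadjusted analysis `base_power`.
    delta = z_alpha + norm.inv_cdf(base_power)
    # Assumed: adjustment scales the standard error by sqrt(1 - r^2).
    delta_adj = delta / (1 - r ** 2) ** 0.5
    adjusted_power = norm.cdf(delta_adj - z_alpha)
    return adjusted_power / base_power

# Compare the approximation-based ratio with 1 + R^2 / 2.
for r in (0.1, 0.3, 0.5):
    print(r, round(power_ratio(r), 3), round(1 + 0.5 * r ** 2, 3))
```

Because power cannot exceed 1, the ratio is capped at 1.25 when the unadjusted power is 80%, so the agreement is expected to degrade for large R.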

arXiv.org e-Print Archive

Can RBMs be trained with zero step contrastive divergence?

Author: Fisher Charles K.
Publication date: 03/11/2022

Restricted Boltzmann Machines (RBMs) are probabilistic generative models that can be trained by maximum likelihood in principle, but are usually trained by an approximate algorithm called Contrastive Divergence (CD) in practice. In general, a CD-k algorithm estimates an average with respect to the model distribution using a sample obtained from a k-step Markov Chain Monte Carlo algorithm (e.g., block Gibbs sampling) starting from some initial configuration. Choices of k typically vary from 1 to 100. This technical report explores whether it is possible to leverage a simple approximate sampling algorithm with a modified version of CD in order to train an RBM with k=0. As usual, the method is illustrated on MNIST.
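The standard CD-k estimate described in this abstract can be sketched as follows. This is a minimal illustration for a binary RBM using NumPy, with the Gibbs chain started at the data; the function name `cd_k_gradients` and the parameter names are hypothetical, and the report's modified k=0 algorithm is not reproduced here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd_k_gradients(v_data, W, a, b, k, rng):
    """CD-k gradient estimates for a binary RBM.

    v_data: (n, n_visible) binary batch; W: (n_visible, n_hidden);
    a: visible biases; b: hidden biases; k: number of block Gibbs steps.
    """
    ph_data = sigmoid(v_data @ W + b)        # P(h = 1 | v) on the data
    v = v_data
    for _ in range(k):
        # One block Gibbs step: sample h given v, then v given h.
        p_h = sigmoid(v @ W + b)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        p_v = sigmoid(h @ W.T + a)
        v = (rng.random(p_v.shape) < p_v).astype(float)
    ph_model = sigmoid(v @ W + b)            # P(h = 1 | v) on chain samples
    n = v_data.shape[0]
    # Data statistics minus (approximate) model statistics.
    dW = (v_data.T @ ph_data - v.T @ ph_model) / n
    da = (v_data - v).mean(axis=0)
    db = (ph_data - ph_model).mean(axis=0)
    return dW, da, db

rng = np.random.default_rng(0)
v_batch = (rng.random((8, 6)) < 0.5).astype(float)
W = rng.normal(0.0, 0.1, (6, 4))
a, b = np.zeros(6), np.zeros(4)
dW, da, db = cd_k_gradients(v_batch, W, a, b, k=1, rng=rng)
print(dW.shape, da.shape, db.shape)
```

In this sketch, setting k=0 leaves the chain at the data, so the two terms cancel and every gradient estimate is zero. That suggests why training with k=0 needs some other source of model samples, such as the approximate sampler the abstract mentions.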

arXiv.org e-Print Archive

Constructing ensembles for intrinsically disordered proteins

Authors: Fisher Charles K.; Stultz Collin M.
Publication venue: 'Elsevier BV'
Publication date: 01/04/2011

The relatively flat energy landscapes associated with intrinsically disordered proteins make modeling these systems especially problematic. A comprehensive model for these proteins requires one to build an ensemble consisting of a finite collection of structures, and their corresponding relative stabilities, that adequately captures the range of accessible states of the protein. In this regard, methods that use computational techniques to interpret experimental data in terms of such ensembles are an essential part of the modeling process. In this review, we critically assess the advantages and limitations of current techniques and discuss new methods for the validation of these ensembles.

DSpace@MIT

PubMed Central